Abstract

In today's digital world, data is wealth. This data is analyzed, developed and applied to specific applications
using well-developed algorithms known as machine learning (ML). Machine learning algorithms are of the
supervised, unsupervised, semi-supervised and reinforcement types. Deep learning (DL) is a subfield of ML
that analyzes data on a large scale through a particular type of learning based on artificial neural networks
(ANN). This paper gathers different machine learning terminologies for quick reference. The study focuses
on different machine learning techniques and their connection to various real-world applications such as
smart cities, cyber security, healthcare, agriculture and intelligent transportation systems. Machine learning
concepts, different types of architectures, the challenges and various real-world applications are discussed.

Keywords: Deep Learning (DL), Machine Learning (ML), Supervised, Unsupervised, Semi-supervised, 
Reinforcement, ANN, RvNN, RNN, CNN. 


1. Introduction 


Machine learning is a procedure by which a
computing machine continually improves with
experience as it executes a learning process. In
recent years, machine learning has become a branch
of AI that is in growing demand in computing and
information analysis, because it lets applications
act intelligently [2].
Figure 1 Relation between Artificial
Intelligence, Machine Learning, and Deep
Learning


This technology is very useful for improving
systems, because it allows an application to learn
from experience rather than from an explicitly
written program. That is why machine learning
algorithms are important for the growth of
real-time applications that address various
real-world problems by analyzing the data
intelligently [2]. The relationship between AI,
machine learning and deep learning is shown in
Figure 1.
2. Machine Learning Techniques 

Machine learning algorithms are mainly classified
into four types: (i) Unsupervised Learning,
(ii) Semi-Supervised Learning, (iii) Supervised
Learning and (iv) Reinforcement Learning [5].

2.1 Unsupervised Learning 

An unsupervised algorithm is used when the data is
available only in the form of inputs and there are
no corresponding output parameters. These
algorithms learn the basic patterns in the data in
order to capture its characteristics [5]. The main
types of unsupervised learning techniques are
clustering, feature learning, finding association
rules, etc. [7]. In the clustering technique,
essential categories in the data are found and then
used to predict the output for unseen inputs.
Figures 2, 2a and 2b show the machine learning
techniques.

Figure 2 Overview of Unsupervised Learning:
Input Samples are Grouped into Clusters Based
on the Basic Patterns [2].

Figure 2a Machine Learning Time Line
[Milestones shown in the figure: Arthur Samuel
develops the world's first self-learning program;
James William Cooley and John Tukey co-develop
the fast Fourier transform; ANN, the R language
and WEKA are released; Tin Kam Ho creates the
first algorithm for decision forests to enable
better prediction performance; Kaggle, a website
for ML competitions, is launched; Andrew Ng and
Jeff Dean create a NN that learns to recognise
cats by watching unlabelled images; IBM Watson
defeats two human challengers at Jeopardy; the
Google AlphaGo program beats an unhandicapped
professional human player; from 2016 to date,
machine learning is widespread in various
domains.]

Figure 2b Machine Learning Techniques
2.2 Semi-Supervised Learning

Semi-supervised learning sits between supervised
and unsupervised learning methods. These
techniques are trained using a mixture of tagged
and untagged data. In a typical setting there is a
small quantity of tagged data and a huge quantity
of untagged data. The fundamental method is that
similar data is first clustered using an
unsupervised learning technique, and the existing
tagged data is then used to tag the remaining
untagged data [23]. Figure 3 shows an overview of
semi-supervised learning.

Figure 3 Overview of Semi-Supervised Learning:
The Clusters Formed by a Large Amount of
Untagged Data are Used to Classify a Limited
Amount of Tagged Data
2.3 Supervised Learning 

If the data is available as input parameters
together with output target parameters, a
supervised learning technique is used. This
technique learns the mapping function from the
input to the output. Its need for large-scale
labelled data samples makes it a costly approach
for tasks where data is scarce. Supervised learning
approaches, shown in Figure 4, can be broadly
classified into two major groups.

2.3.1 Classification 
The output parameter is one of a known number of
categories, for example “dog” or “rat”, “positive”
or “negative”.

2.3.2 Regression 
The output parameter is a real or continuous value,
for example “daily temperature of a place”,
“price” or “geographical location” [24].
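As a minimal sketch of how a supervised method learns a mapping from input to a continuous output, the following fits a straight line by ordinary least squares. The day and temperature values, and the function name `fit_line`, are hypothetical and serve only as illustration.

```python
def fit_line(xs, ys):
    """Least-squares fit of y = slope * x + intercept to labelled pairs."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Spread of the inputs and co-variation of inputs with targets.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict(model, x):
    """Apply the learned mapping to a new input."""
    slope, intercept = model
    return slope * x + intercept


if __name__ == "__main__":
    days = [1.0, 2.0, 3.0, 4.0]            # inputs (hypothetical)
    temperatures = [20.0, 22.0, 24.0, 26.0]  # target outputs (hypothetical)
    model = fit_line(days, temperatures)
    print(predict(model, 5.0))
```

A classification task would replace the continuous target with a category label, but the idea of learning from input and output pairs is the same.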


Figure 4 Overview of Supervised Learning 
Input Examples are Categorized into a Known 
Set of Classes 

2.4 Reinforcement Learning

The reinforcement learning method, shown in
Figure 5, is used when the task at hand is to make
a sequence of decisions towards a final reward.
During the learning process, an artificial agent
receives either rewards or penalties for the
actions it performs. Its goal is to maximize the
total reward. Examples include teaching agents to
play computer games or to perform robotics tasks
with an end goal [15].
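The reward-driven loop can be illustrated with tabular Q-learning, one common reinforcement learning algorithm (the text above does not name a specific one). The corridor environment, the reward values and all hyperparameters below are hypothetical choices made for this sketch.

```python
import random

N_STATES = 5        # states 0..4; state 4 is the goal (hypothetical toy task)
ACTIONS = (-1, +1)  # move left or move right


def step(state, action):
    """Environment: returns (next_state, reward, done)."""
    next_state = min(max(state + action, 0), N_STATES - 1)
    if next_state == N_STATES - 1:
        return next_state, 1.0, True   # final reward at the goal
    return next_state, -0.01, False    # small penalty for every other move


def train(episodes=500, alpha=0.5, gamma=0.9, epsilon=0.2, seed=0):
    """Learn action values q[state][action_index] by trial and error."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]
    for _ in range(episodes):
        state, done = 0, False
        while not done:
            if rng.random() < epsilon:
                a = rng.randrange(2)                        # explore
            else:
                a = 0 if q[state][0] > q[state][1] else 1   # exploit
            next_state, reward, done = step(state, ACTIONS[a])
            # Move the estimate towards reward plus discounted future value.
            target = reward if done else reward + gamma * max(q[next_state])
            q[state][a] += alpha * (target - q[state][a])
            state = next_state
    return q


def greedy_policy(q):
    """Best action for every non-goal state according to the learned values."""
    return [ACTIONS[0] if row[0] > row[1] else ACTIONS[1] for row in q[:-1]]


if __name__ == "__main__":
    print(greedy_policy(train()))
```

After training, the greedy policy moves right in every state, which is the sequence of decisions that reaches the reward fastest in this toy task.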


Figure 5 An Overview of Reinforcement
Learning: an Agent Observes the Environment
State and Performs Actions to Maximize an
Overall Reward
[Diagram labels: Action, Environment, Reward]

3. Deep Learning 
Deep learning is a subset of machine learning.
Among the different ML algorithms, deep learning
(DL) is very commonly employed in different
applications. DL is also known as representation
learning (RL) [6]. The popular types of deep
learning networks are recursive neural networks
(RvNNs), recurrent neural networks (RNNs) and
convolutional neural networks (CNNs).

3.1 Recursive neural networks 
An RvNN can be used to predict structure using
compositional vectors [6]. The RvNN is designed
for processing structured objects such as graphs
and trees. These variable-size recursive data
structures are mapped to a fixed-width
representation, and the network is trained with
back-propagation through structure (BTS) [26].
BTS follows the general back-propagation
algorithm while supporting a tree-like structure.
The RvNN calculates a score for each candidate
pair of units that could be merged when
constructing the tree. Next, the pair with the
largest score is merged into a compositional
vector. Following every merge, the RvNN creates a
larger region of many units, a compositional
vector for the region, and a class label. The root
of the RvNN tree structure is the compositional
vector for the entire region. An example RvNN tree
is shown in Figure 6.
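The greedy merge procedure described above can be sketched as follows. The composition and scoring functions here are fixed, hypothetical stand-ins; a trained RvNN would use learned weights for both, and class-label prediction is omitted.

```python
import math


def compose(left, right):
    """Merge two child vectors into one parent vector of the same width.
    Hypothetical fixed rule: tanh of the element-wise mean."""
    return [math.tanh((a + b) / 2.0) for a, b in zip(left, right)]


def score(vector):
    """Hypothetical merge score: the sum of the parent vector's components."""
    return sum(vector)


def build_tree(leaves):
    """Greedily merge the best-scoring adjacent pair until one root remains.
    Each node is (structure, vector); structure records how it was built."""
    nodes = [(str(i), vec) for i, vec in enumerate(leaves)]
    while len(nodes) > 1:
        # Candidate parent vector for every adjacent pair.
        candidates = [compose(nodes[i][1], nodes[i + 1][1])
                      for i in range(len(nodes) - 1)]
        best = max(range(len(candidates)), key=lambda i: score(candidates[i]))
        merged = ("(" + nodes[best][0] + " " + nodes[best + 1][0] + ")",
                  candidates[best])
        nodes[best:best + 2] = [merged]   # replace the pair by its parent
    return nodes[0]


if __name__ == "__main__":
    leaves = [[0.1, 0.1], [0.9, 0.8], [0.7, 0.9], [0.0, 0.0]]
    structure, root_vector = build_tree(leaves)
    print(structure, root_vector)
```

The returned root vector plays the role of the compositional vector for the entire input, and the structure string shows the order in which regions were merged.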
Figure 6 An Example of RvNN Tree
[Diagram label: Semantic representation for each
region]
3.2 Recurrent neural networks
The RNN is one of the most frequently employed and
simplest algorithms in deep learning [6, 9, 10].
The main application of the RNN is in the area of
speech processing [29, 30]. An RNN processes
sequential data in the network, so it can be
regarded as a unit of short-term memory. An RNN
has an input layer, a hidden layer and an output
layer. For a given input sequence, a typical
unfolded RNN diagram is illustrated in Figure 7.
RNNs are mainly based on three techniques:
“Hidden-to-Hidden”, “Hidden-to-Output” and
“Input-to-Hidden” [3].
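A minimal sketch of the forward pass of a plain RNN cell is given below. Reading the three names above as the three weight mappings of such a cell is an assumption of this sketch, as are the names `w_xh`, `w_hh` and `w_hy` and the tanh activation.

```python
import math


def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * v for w, v in zip(row, vector)) for row in matrix]


def rnn_forward(inputs, w_xh, w_hh, w_hy):
    """Run a plain RNN over a sequence of input vectors.
    w_xh: input-to-hidden, w_hh: hidden-to-hidden, w_hy: hidden-to-output."""
    hidden = [0.0] * len(w_hh)   # the short-term memory starts empty
    outputs = []
    for x in inputs:
        pre = [a + b for a, b in zip(matvec(w_xh, x), matvec(w_hh, hidden))]
        hidden = [math.tanh(p) for p in pre]   # new state mixes input and memory
        outputs.append(matvec(w_hy, hidden))
    return outputs, hidden


if __name__ == "__main__":
    # One input unit, one hidden unit, one output unit (hypothetical weights).
    outputs, _ = rnn_forward([[1.0], [0.0]], [[1.0]], [[1.0]], [[2.0]])
    print(outputs)
```

In the usage example the second input is zero, yet the second output is non-zero because the hidden state still carries information from the first step.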
Figure 7 Typical Unfolded RNN Diagram
3.3 Convolutional neural networks 
The most well-known and frequently employed
algorithm in the area of deep learning is the CNN
[6, 32, 33-37]. It is considered more powerful
than the RNN, which offers less feature
compatibility than the CNN. The main benefit of
the CNN is automatic identification of the
relevant features without any human supervision
[12]. CNNs can be applied in different fields,
such as face recognition [41], speech processing
[4] and computer vision [39]. The three important
advantages identified by Goodfellow et al. [13]
are equivariant representations, sparse
interactions, and parameter sharing. CNNs make
full use of 2D input-data structures such as image
signals. Parameter sharing keeps the number of
parameters small, which simplifies the training
process and speeds up the network. A commonly
used type of CNN, which is similar to the
multi-layer perceptron (MLP), consists of many
convolution layers preceding sub-sampling
(pooling) layers, while the final layers are fully
connected (FC) layers. An example of a CNN
architecture for image classification is
illustrated in Figure 8.
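The convolution, ReLU and pooling stages can be sketched in a few lines. The image and the edge-detecting kernel below are hypothetical; a trained CNN learns its kernel values, and the fully connected classification layer is omitted here.

```python
def conv2d(image, kernel):
    """Valid 2D convolution as used in deep learning (cross-correlation):
    the same small kernel is slid over every position (parameter sharing)."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [[sum(image[r + i][c + j] * kernel[i][j]
                 for i in range(kh) for j in range(kw))
             for c in range(out_w)]
            for r in range(out_h)]


def relu(feature_map):
    """Replace negative activations by zero."""
    return [[max(0.0, v) for v in row] for row in feature_map]


def max_pool(feature_map, size=2):
    """Sub-sample by keeping the maximum of each size x size block."""
    return [[max(feature_map[r + i][c + j]
                 for i in range(size) for j in range(size))
             for c in range(0, len(feature_map[0]) - size + 1, size)]
            for r in range(0, len(feature_map) - size + 1, size)]


if __name__ == "__main__":
    image = [[0.0, 0.0, 1.0, 1.0, 0.0] for _ in range(5)]  # bright vertical bar
    kernel = [[-1.0, 1.0], [-1.0, 1.0]]                    # dark-to-bright edge
    print(max_pool(relu(conv2d(image, kernel))))
```

The kernel has only four parameters regardless of the image size, which illustrates why weight sharing keeps the parameter count small.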


Figure 8 An Example of CNN Architecture for
Image Classification
[Diagram stages: Input image, Convolution Layer,
ReLU Layer, Pooling Layer, Fully Connected
Layer, Output Classes (Dog, Not Dog)]
4. Algorithms 
4.1 Introduction to Mainstream Supervised 
Learning Algorithms 

4.1.1 Decision Tree 
A decision tree mainly consists of internal nodes,
branches and leaf nodes. Each internal node
performs a test on an attribute, each branch
represents an outcome of the test, and each leaf
node holds a class label. The tree structure is
used to represent and identify the decision-making
criteria for a given problem [1].
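The node, branch and leaf structure can be sketched with nested dictionaries. The attributes, outcomes and class labels in this tree are hypothetical and hand-written; a real decision tree would be learned from data.

```python
# Internal nodes test one attribute, branches are the test outcomes,
# and leaves (plain strings) are class labels. All values are hypothetical.
TREE = {
    "attribute": "outlook",
    "branches": {
        "sunny": {
            "attribute": "humidity",
            "branches": {"high": "stay in", "normal": "play"},
        },
        "overcast": "play",
        "rainy": "stay in",
    },
}


def classify(tree, sample):
    """Follow the branch matching each test until a leaf label is reached."""
    node = tree
    while isinstance(node, dict):
        outcome = sample[node["attribute"]]
        node = node["branches"][outcome]
    return node


if __name__ == "__main__":
    print(classify(TREE, {"outlook": "sunny", "humidity": "normal"}))
```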

4.1.2 Random Forests 
Random forest is an important ensemble learning
method for classification and regression based on
bagging. It operates by combining the decisions of
individual trees, taking unlabelled samples as
input and producing classification results as
output. The performance bottleneck faced by a
single decision tree is resolved by combining the
bagging method with the decision tree.
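The bagging idea can be sketched with very small trees. The sketch below trains one-level trees (stumps) on bootstrap samples of one-dimensional data and combines them by majority vote; a full random forest would additionally pick random feature subsets at each split, which is not shown. The data and function names are hypothetical.

```python
import random
from collections import Counter


def train_stump(samples):
    """Fit a one-level tree on (value, label) pairs by choosing the
    threshold that misclassifies the fewest training samples."""
    values = sorted({v for v, _ in samples})
    majority = Counter(label for _, label in samples).most_common(1)[0][0]
    best = (len(samples) + 1, None, majority, majority)
    for lo, hi in zip(values, values[1:]):
        threshold = (lo + hi) / 2
        left = Counter(l for v, l in samples if v <= threshold).most_common(1)[0][0]
        right = Counter(l for v, l in samples if v > threshold).most_common(1)[0][0]
        errors = sum((left if v <= threshold else right) != l for v, l in samples)
        if errors < best[0]:
            best = (errors, threshold, left, right)
    _, threshold, left, right = best
    # With no usable threshold the stump always predicts the majority label.
    return lambda v: left if threshold is None or v <= threshold else right


def train_forest(samples, n_trees=25, seed=0):
    """Bagging: every tree sees a bootstrap sample drawn with replacement."""
    rng = random.Random(seed)
    return [train_stump([rng.choice(samples) for _ in samples])
            for _ in range(n_trees)]


def predict(forest, value):
    """Combine the individual tree decisions by majority vote."""
    return Counter(tree(value) for tree in forest).most_common(1)[0][0]


if __name__ == "__main__":
    data = [(1.0, "a"), (1.5, "a"), (2.0, "a"), (2.5, "a"),
            (7.0, "b"), (7.5, "b"), (8.0, "b"), (8.5, "b")]
    forest = train_forest(data)
    print(predict(forest, 1.2), predict(forest, 8.2))
```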

4.1.3 Naive Bayes 
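Naive Bayes is one of the simplest and most widely
used Bayesian algorithms. It assumes that the
effect of each attribute on the target parameter
is independent of the other attributes. The naive
Bayes algorithm is based on Bayes' theorem:

$$P(c \mid x) = \frac{P(x \mid c)\, P(c)}{P(x)}$$

where c is a class and x is the vector of
attribute values.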
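A minimal sketch of a naive Bayes classifier for categorical attributes follows. The weather-style data is hypothetical, and the add-one (Laplace) smoothing is an assumption of this sketch rather than something stated in the text. Scores are proportional to the posterior, since the common denominator is dropped.

```python
from collections import Counter, defaultdict


def train_naive_bayes(rows, labels):
    """Count classes and, per class, the values seen for each attribute."""
    class_counts = Counter(labels)
    feature_counts = defaultdict(Counter)  # (attribute index, label) -> value counts
    values = defaultdict(set)              # attribute index -> distinct values
    for row, label in zip(rows, labels):
        for i, value in enumerate(row):
            feature_counts[(i, label)][value] += 1
            values[i].add(value)
    return class_counts, feature_counts, values


def predict(model, row):
    """Return the most probable class and the unnormalized class scores."""
    class_counts, feature_counts, values = model
    total = sum(class_counts.values())
    scores = {}
    for label, count in class_counts.items():
        score = count / total                       # prior P(c)
        for i, value in enumerate(row):
            # P(x_i | c) with add-one smoothing; attributes treated as independent.
            score *= (feature_counts[(i, label)][value] + 1) / (count + len(values[i]))
        scores[label] = score
    return max(scores, key=scores.get), scores


if __name__ == "__main__":
    rows = [("sunny", "hot"), ("sunny", "mild"), ("rainy", "mild"),
            ("rainy", "cool"), ("overcast", "hot")]
    labels = ["no", "no", "yes", "yes", "yes"]
    print(predict(train_naive_bayes(rows, labels), ("rainy", "mild")))
```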

4.1.4 Support Vector Machines (SVM) 
Through a nonlinear transformation, the SVM
algorithm maps the input space to a
high-dimensional feature space. The empirical risk
is minimal if the given data set is linearly
separable. The algorithm then finds an optimal
separating hyperplane that divides the data into
two categories while maximizing the classification
margin between them. The larger the classification
margin, the better the generalization ability of
the support vector machine.
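For the linearly separable case, the search for the boundary with the largest gap is commonly written as the following optimization problem. The symbols w, b, the feature map \phi and the labels y_i in {-1, +1} are notation introduced for this sketch and do not appear in the text above.

```latex
% Hard-margin SVM: w and b define the separating hyperplane,
% \phi is the nonlinear map into the high-dimensional feature space.
\min_{w,\,b} \ \frac{1}{2}\lVert w \rVert^{2}
\quad \text{subject to} \quad
y_i \left( w^{\top} \phi(x_i) + b \right) \ge 1, \qquad i = 1, \dots, n
% The gap between the two classes equals 2 / \lVert w \rVert,
% so minimizing \lVert w \rVert maximizes the gap.
```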

4.2 Main Unsupervised Learning Algorithms [16]

4.2.1 The K-means Algorithm 

The K-means algorithm is a well-known cluster
analysis algorithm, introduced by MacQueen. It is
defined as follows: given n data points {x1,
x2, ..., xn}, K cluster centres {a1, a2, ..., aK}
are found such that the sum of squared distances
between each data point and its closest cluster
centre is smallest. This sum of squared distances
is called the objective function Wn, whose
mathematical expression is [8]:

$$W_n = \sum_{i=1}^{n} \min_{1 \le j \le K} \lvert x_i - a_j \rvert^{2}$$

The algorithm proceeds as follows:

• K sample points are selected from the sample
dataset D, and their values are assigned to the
initial clustering centres m_i (i = 1, ..., K).

• The distance between each sample point p_j
(j = 1, ..., n) and each clustering centre m_i
is calculated:

$$d(i, j) = \sqrt{\lvert p_j - m_i \rvert^{2}}$$

• The minimum distance min d(i, j) between p_j
and the centres m_i is found, and p_j is placed
in the cluster whose centre is closest.

• The clustering centre of each cluster is
calculated again as the mean of the points
assigned to it:

$$m_i = \frac{1}{N_i} \sum_{p_j \in C_i} p_j$$

where C_i is the i-th cluster and N_i is the
number of points in it.

• The squared error E(t) of all points in
dataset D is calculated and compared with the
previous error E(t-1).

• If E(t) - E(t-1) is below zero, the procedure
returns to the distance calculation step;
otherwise the algorithm ends.
Because the K-means algorithm is time-efficient and
easy to describe, it is well suited to large-scale
data processing.
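The steps listed above can be sketched as follows. Choosing the initial centres at random with a fixed seed, keeping the old centre when a cluster becomes empty, and the `max_iter` safety limit are assumptions of this sketch.

```python
import random


def squared_distance(p, q):
    """Squared Euclidean distance between two points."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def k_means(points, k, seed=0, max_iter=100):
    rng = random.Random(seed)
    centres = rng.sample(points, k)   # K sample points become the initial centres
    previous_error = float("inf")
    for _ in range(max_iter):
        # Assign every point to the cluster of its closest centre.
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda i: squared_distance(p, centres[i]))
            clusters[nearest].append(p)
        # Recompute each centre as the mean of its cluster.
        for i, cluster in enumerate(clusters):
            if cluster:   # an empty cluster keeps its old centre
                centres[i] = tuple(sum(c) / len(cluster) for c in zip(*cluster))
        # Stop once the squared error no longer decreases.
        error = sum(min(squared_distance(p, c) for c in centres) for p in points)
        if error >= previous_error:
            break
        previous_error = error
    return centres, error


if __name__ == "__main__":
    data = [(1.0, 1.0), (1.0, 2.0), (2.0, 1.0), (2.0, 2.0),
            (8.0, 8.0), (8.0, 9.0), (9.0, 8.0), (9.0, 9.0)]
    print(k_means(data, 2))
```

On two well-separated groups such as those in the usage example, the centres settle at the two group means after a few iterations.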


4.3 Introduction to Main Semi-supervised 
Learning Algorithms 
4.3.1 The Self-training Algorithm Based on 
the K-nearest Neighbour 

The K-nearest neighbour algorithm is a simple
supervised learning algorithm that uses a training
set to split the feature space into different
regions, with each sample occupying a certain
region. A self-training K-nearest neighbour
algorithm, by contrast, contains no training set.
4.3.2 The Semi-supervised Learning
Algorithm Based on Divergence
The divergence-based semi-supervised learning
method proceeds as follows. A classifier is used to
classify and label the unlabelled test samples,
which are then added to the classifier's training
set. This process is repeated until all samples
are labelled. This simple and effective technique
has a relatively rigorous theoretical basis and
has found a wide range of applications.
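The label-and-extend loop described above can be sketched with a 1-nearest-neighbour base classifier. Labelling one sample per round, and picking the sample closest to an already labelled point as the most confident one, are assumptions of this sketch; the data is hypothetical.

```python
def squared_distance(p, q):
    """Squared Euclidean distance between two points."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def self_train(labelled, unlabelled):
    """labelled: list of (point, label); unlabelled: list of points.
    Repeatedly labels the unlabelled point nearest to any labelled point
    and moves it into the training set, until nothing is left unlabelled."""
    training = list(labelled)
    remaining = list(unlabelled)
    while remaining:
        # Most confident prediction = smallest distance to a labelled neighbour.
        _, index, label = min(
            (squared_distance(p, q), i, lab)
            for i, p in enumerate(remaining)
            for q, lab in training
        )
        training.append((remaining.pop(index), label))
    return training


if __name__ == "__main__":
    labelled = [((0.0, 0.0), "a"), ((10.0, 10.0), "b")]
    unlabelled = [(1.0, 1.0), (2.0, 2.0), (9.0, 9.0), (6.0, 6.0)]
    for point, label in self_train(labelled, unlabelled):
        print(point, label)
```

Because newly labelled points join the training set, later decisions can rely on neighbours that were themselves unlabelled at the start.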
5. Real-world Applications 
Machine learning methods are very popular in
Industry 4.0, because they make intelligent
decisions and have the capability to learn from
past experience.

5.1 Intelligent Decision Making and Predictive
Analytics

In this area, the widely used techniques are SVM,
decision trees and ANN, applied to inventory
management, avoidance of out-of-stock situations,
and modelling of customer behaviour and
preferences [2, 14]. Examples include credit card
fraud detection and the identification of suspects
after a crime. These techniques help organizations
in healthcare, financial services, transportation,
sales and marketing, telecommunication,
e-commerce, social networking and other sectors to
predict outcomes accurately.

5.2 Cyber-Security and Threat Intelligence 
The duty of cyber security is to safeguard data,
systems, hardware and networks [2, 15]. Machine
learning is one of the important technologies of
cyber security: it protects by securing
information, keeps people safe while browsing,
identifies threats, and discovers adware in
network traffic.
5.3 Sustainable Agriculture
Sustainable agriculture practices supported by
mobile technologies, IoT and mobile devices help
to increase agricultural output while reducing
negative environmental consequences [2, 58-60].
Different machine learning techniques are applied
in production planning, weed detection, soil
nutrient management, weather prediction, inventory
management, analysis of soil properties, etc.
[2, 61].

5.4 Natural Language Processing 
DNN methods are more effective than traditional
natural language processing techniques, because
the traditional techniques handle problems such as
language models and semantically related words
separately, with no overall processing. In 2008,
Collobert began to apply the DNN method to natural
language processing, achieving an error rate of
14.3% [11].
Conclusion 
This article presents a comprehensive review of
machine learning for intelligent data analysis and
real-time applications. Various machine learning
algorithms are discussed for intelligent data
analysis, followed by machine learning techniques,
algorithms and deep learning. To generate
intelligent decisions, machine learning algorithms
need to be familiarized with target application
knowledge and trained with data collected from
various real-world situations. Finally, challenges
and future directions are discussed and
summarised. This study serves as a reference for
decision makers across a variety of application
domains and real-world situations. Machine
learning is spreading across a wide range of
industries, including banking and finance,
information technology, media and entertainment,
gaming, and the automobile sector.